Analytical Review of the News Data Classification Methods with Multivariate Classification Attributes
Abstract
News classification has emerged as an important sub-branch of data mining. Much work has already been done on news classification with a variety of classifiers and feature descriptors, and a number of news classification projects run in real-time systems today. News classification is an important part of online news portals, which grow in number every year and keep adding users. News classification is a branch of text classification, or text mining, and researchers have already done extensive work on text classification models using different approaches. News items have to be classified into categories such as sports, politics, technology, business, science, health, regional news, and many other similar categories. Researchers have applied many supervised and unsupervised methods to news classification, and the supervised models have been found more efficient for this purpose. The major goal of news classification research is to improve accuracy while decreasing elapsed time. Our news classification model proposes the use of k-means and lexicon analysis of the news data, with the nearest neighbor algorithm performing the classification. The k-means algorithm is a clustering algorithm and is used primarily to produce text data clusters that carry the important information. Lexicon analysis is then performed over the given text data, and the final classification of the news is done using k-nearest neighbor. The results are reported in terms of parameters such as accuracy and elapsed time.
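The three stages named in the abstract (k-means clustering, lexicon analysis, k-nearest-neighbor classification) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not say how the stages are combined, so here the k-means cluster id and the lexicon scores are simply concatenated into one feature vector for the nearest-neighbor vote. The word lists in `LEXICON`, the `NewsClassifier` class, its parameters, and the toy headlines are all hypothetical.

```python
import math
from collections import Counter

# Hypothetical lexicon: the abstract does not give its word lists.
LEXICON = {
    "business": {"bank", "market", "profit", "shares"},
    "sports": {"goal", "league", "match", "team"},
}
CATEGORIES = sorted(LEXICON)  # fixed order: ["business", "sports"]


def tokenize(text):
    """Lowercase, split on whitespace, strip surrounding punctuation."""
    words = (w.strip(".,;:!?\"'").lower() for w in text.split())
    return [w for w in words if w]


def lexicon_scores(text):
    """Share of tokens that appear in each category's word list."""
    tokens = tokenize(text)
    return [sum(t in LEXICON[c] for t in tokens) / max(len(tokens), 1)
            for c in CATEGORIES]


def bag_of_words(text, vocab):
    counts = Counter(tokenize(text))
    return [float(counts[w]) for w in vocab]


def distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest(point, centroids):
    """Index of the centroid closest to point."""
    return min(range(len(centroids)), key=lambda i: distance(point, centroids[i]))


def kmeans(points, k, iterations=10):
    """Lloyd's algorithm, seeded with the first k points for repeatability."""
    centroids = [list(p) for p in points[:k]]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        for i, group in enumerate(groups):
            if group:  # keep the old centroid if a cluster empties
                centroids[i] = [sum(col) / len(group) for col in zip(*group)]
    return centroids


class NewsClassifier:
    """Assumed combination: lexicon scores + one-hot cluster id, then k-NN."""

    def __init__(self, n_clusters=2, n_neighbors=3):
        self.n_clusters = n_clusters
        self.n_neighbors = n_neighbors

    def _features(self, text):
        cluster = nearest(bag_of_words(text, self.vocab), self.centroids)
        one_hot = [1.0 if i == cluster else 0.0 for i in range(self.n_clusters)]
        return lexicon_scores(text) + one_hot

    def fit(self, texts, labels):
        self.vocab = sorted({t for text in texts for t in tokenize(text)})
        vectors = [bag_of_words(text, self.vocab) for text in texts]
        self.centroids = kmeans(vectors, self.n_clusters)
        self.train_x = [self._features(text) for text in texts]
        self.train_y = list(labels)
        return self

    def predict(self, text):
        x = self._features(text)
        ranked = sorted(zip(self.train_x, self.train_y),
                        key=lambda pair: distance(x, pair[0]))
        votes = Counter(label for _, label in ranked[:self.n_neighbors])
        return votes.most_common(1)[0][0]


# Toy training data (invented for illustration only).
train_texts = [
    "team wins league match",
    "bank shares lift market",
    "team scores goal in match",
    "market profit lifts bank shares",
]
train_labels = ["sports", "business", "sports", "business"]

model = NewsClassifier().fit(train_texts, train_labels)
print(model.predict("league team scores a goal"))
print(model.predict("bank profit lifts the market"))
```

Seeding k-means with the first k points keeps the sketch deterministic; a real system would more likely use random restarts or k-means++ seeding. To report the accuracy and elapsed-time figures the abstract mentions, it would also need a held-out test set and timing around `fit` and `predict`.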
Similar resources
Support Vector Machine Based Facies Classification Using Seismic Attributes in an Oil Field of Iran
Seismic facies analysis (SFA) aims to classify similar seismic traces based on amplitude, phase, frequency, and other seismic attributes. SFA has proven useful in interpreting seismic data, allowing significant information on subsurface geological structures to be extracted. While facies analysis has been widely investigated through unsupervised-classification-based studies, there are few cases...
A New Document Embedding Method for News Classification
Text classification is one of the main tasks of natural language processing (NLP). In this task, documents are classified into pre-defined categories. A great deal of news spreads on the web. A text classifier can categorize news automatically, which facilitates and accelerates access to the news. The first step in text classification is to represent documents in a suitable way t...
Arabic News Articles Classification Using Vectorized-Cosine Based on Seed Documents
Besides its own merits, text classification (TC) has become a cornerstone in many applications. The work presented here is part of and a prerequisite for a project we have undertaken to create a corpus for the Arabic text process. It is an attempt to create modules automatically that would help speed up the process of classification for any text categorization task. It also serves as a tool for...
A New Framework for Distributed Multivariate Feature Selection
Feature selection is considered an important issue in the classification domain. Selecting good features through the criteria of maximum relevance to the class label and minimum redundancy among features improves classification accuracy. However, most current feature selection algorithms work only with centralized methods. In this paper, we suggest a distributed version of the mRMR featu...
A Framework for Optimal Attribute Evaluation and Selection in Hesitant Fuzzy Environment Based on Enhanced Ordered Weighted Entropy Approach for Medical Dataset
Background: In this paper, a generic hesitant fuzzy set (HFS) model for clustering various ECG beats according to the weights of attributes is proposed. A comprehensive review of electrocardiogram signal classification and segmentation methodologies indicates that algorithms able to effectively handle the nonstationarity and uncertainty of the signals should be used for ECG analysis. Ex...
Publication date: 2016